perm filename NSF[AM,DBL]1 blob sn#398084 filedate 1978-11-29 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00024 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00003 00002	.DEVICE XGP
C00004 00003	.portion TITLEP
C00007 00004	.portion mainbod 
C00008 00005	.begin "abstractpag"
C00014 00006	.begin "aabstractpag"
C00017 00007	.skip to column 1 chap(PREVIEW)
C00023 00008	.chap(|INFERENCE|)
C00045 00009	.sec(Discovery in Mathematics)
C00048 00010	.subsec(Syntheses of Discoveries)
C00051 00011	.chap(|DESIGN OF THE uu-AM] PROGRAM|)
C00071 00012	.skip to column 1 chap(RESULTS)
C00090 00013	.chgfont("BDR40",b)
C00099 00014	.chap(|THE NEXT STEP:  THIS PROPOSAL|)
C00113 00015	.sec(Appropriateness of Available Resources)
C00116 00016	.sec(Budget)
C00117 00017	.spacing 0 preface 1 indent 0,0,0
C00118 00018	.begin "heuri"
C00119 00019	.begin "typscr"
C00121 00020	.begin "aabtrac"
C00122 00021	.begin "impli"
C00123 00022	.begin "vitee"
C00124 00023	.begin "ref"
C00131 00024	.every heading(,,{page})
C00132 ENDMK
C⊗;
.DEVICE XGP;
.require "ksets.dfs[a700pu00]" source!file;
.require "sbook.dfs" source;
.page frame 55 high 80 wide;
.area text lines 4 to 53;
.title area heading lines 1 to 3;
.title area footing line 55;
.chgfont(hd2font,b)
.B!FONT ← "ngi25";
.page←0;
.INSERT CONTENTS;

.fill; adjust; compact; double space; preface 2; indent 0,0,0;
.select a;
.TTY←CRLF&"  Beginning to process the text, finally. ";

.macro B816 ⊂ begin indent 8,16,0; select b; preface 0; once preface 1; ⊃
.portion TITLEP;
.begin "titlepag";
.chgfont(hd2font,b);
.nofill; preface 0; indent 0;
.once  center; select b; 
Research Proposal Submitted to the National Science Foundation



.chgfont(hd1font,b)


%bProposed Amount%a uu- $97,321. ] %b Proposed Effective Date%a uu- July 1, 1977 ] %b Proposed Duration%a uu- 24 months ]


%bTitle%a uu- The Use of Informal Rules to Guide the Search for Discoveries in Mathematics ]


%bPrincipal Investigator%a uu- Prof. Douglas B. Lenat ] %bSoc. Sec. No.%a uu- 221-30-0371 ]
          %bSubmitting institution%a uu- Carnegie-Mellon University ]
          %bDepartment%a uu- Computer Science ]
          %bAddress%a uu- Pittsburgh, Pa. 15213 ]

%bCo-Principal Investigator%a uu- Prof. Herbert A. Simon ] %bSoc. Sec. No.%a uu- 181-26-0743 ]


%bMake grant to%a uu- Carnegie-Mellon University ]

.turn on "∞→\";
.tabs  20,40,60,79; at "zzz" ⊂ "uu-∞ →\]" ⊃;



%bEndorsements:


Principal Investigator\Co-Principal Investigator\  Dept. Head\Institutional Admin. Official

%bName%a uu-Douglas B. Lenat]zzz uu-Herbert A. Simon]zzz uu-Joseph F. Traub]zzz uu- ]zzz

%bSignature%a zzz zzz zzz zzz

%bTitle%a uu-Asst. Professor]zzz uu-Professor]zzz uu-Professor]zzz uu- ]zzz

%bTelephone%a uu-412-621-2600 x183    x167]zzz uu- x152]zzz uu- x]zzz

.comment %bDate%a uu- {date}]zzz uu- {date}]zzz uu- November 1, 1976]zzz zzz ;
%bDate%a uu- ]zzz uu- ]zzz uu- ]zzz zzz

.end "titlepag";
.chgfont(B!FONT,b);
.portion mainbod; 
.chgfont(B!FONT,b);
.fill; adjust; compact; single space; preface 1; indent 0,0,0;
.select a;
.every heading(,,p.   {PAGE!});
.<<every footing(,%buu-DRAFT]: NOT FOR DISTRIBUTION%a,)>>;
.begin "abstractpag";
.fill; spacing 0; preface 1; indent 0,0,0;
.chgfont("bdr40",b);
.group skip 1;
.once center;
%bABSTRACT%a

.group skip 3;
.chgfont(B!FONT,b);

Inductive inference  tasks permeate  both science  and everyday  life
(What will  the  weather be  today? What  is  the structure  of  this
molecule?).  It  was  not until  early Artificial  Intelligence  (AI)
programs tried  -- and  failed at  -- many  %bapparently%a  deductive
tasks (translation,  theorem-proving,  speech recognition)  that  the
importance  of  inductive  inference  to  decision-making  was  fully
appreciated.  Since that time, a major thrust of AI research has been
the understanding  of plausible  (i.e., non-formal)  inference.   One
recent  method has  been to incorporate  task-specific knowledge into
context-sensitive  rules.  Large  collections of such  rules are then
gathered  together,  and  a  computer  rapidly  scans  through  them,
locating all the ones that apply in a given situation.

My research has been aimed at applying this method to an archetypical
inductive task: discovery  of new mathematical concepts. Towards this
end, hundreds  of informal rules of  thumb, called "heuristics", were
collected.  A computer program,  called "AM", which makes use of them
has carried  out  simple kinds  of  math research:   gathering  data,
generalizing  from it, formulating new  conjectures, and defining new
concepts.

The heuristic  rules communicate via an agenda  mechanism, which is a
list  of promising questions  for the program to  explore and reasons
why each  one is  plausible.  A  single question  might cause  AM  to
define a new concept, or to expand some facet of an existing concept,
or to examine some empirical data for regularities, etc.  The program
repeatedly selects  from  the agenda  the  question having  the  best
supporting reasons, and then investigates it.

Each concept is an  active, structured knowledge module.  One hundred
very  incomplete   modules  were   initially  provided.    Each   one
corresponded  to an  elementary set-theoretic  concept (e.g., union).
This provided  a  definite  but immense  "space"  which AM  began  to
explore.   AM extended  its knowledge  base, ultimately rediscovering
hundreds  of common  concepts  (e.g.,  numbers) and  theorems  (e.g.,
unique factorization), plus a few new ones.

The  main limitation  of this  approach is  that as  AM discovers new
concepts it's  %bnot%a able  to synthesize  new heuristic  rules  for
dealing effectively  with  them. The  research  proposed here  is  to
assemble a  sufficient collection  of %buu-meta-]%aheuristics  (rules
which  modify and  create  rules),  and  add  them to  AM.   Just  as
heuristics  were gleaned  by studying how  specific discoveries might
have  been  motivated,  so  %bmeta-%aheuristics  may  be  gleaned  by
studying   how  various   heuristics  might   have  originated.   The
generality of  these new meta-rules will then be  tested by having AM
explore  disparate fields of mathematics.   Since meta-rules deal not
so much with "mathematics" as with "other rules", it is reasonable to
expect  that they  will be almost  exclusively domain-independent. If
so,  they will  be of  general scientific  interest, and specifically
will be of use  to other researchers trying to automate sophisticated
inductive inference tasks.


.end "abstractpag"
.begin "aabstractpag";
.fill; spacing 0; preface 1; indent 0,0,0;
.chgfont("bdr40",b);
.group skip 1;
.once center;
%bABSTRACT%a

.group skip 3;
.chgfont(B!FONT,b);

Inductive inference  tasks permeate  both science  and everyday  life
(What will  the  weather be  today? What  is  the structure  of  this
molecule?).  Can we understand how such non-formal reasoning might be
done?  Our  research  focusses on  one archetypical  inductive  task:
discovery of new mathematical concepts. Towards this end, hundreds of
informal rules  of  thumb, called  "heuristics", were  collected.   A
computer  program, called "AM",  which makes use of  them has carried
out simple kinds of math research:  gathering data, generalizing from
it, formulating  new  conjectures, and  defining new  concepts.   The
heuristic rules communicate via  an agenda mechanism, which is a list
of  promising questions  for the program  to explore  and reasons why
each one is plausible.   Scant knowledge about one hundred elementary
set theory  concepts (each one represented  as a structured knowledge
module) was initially supplied.  This provided a definite but immense
"space"  which  AM  began to  explore.   AM  ultimately  rediscovered
hundreds  of common  concepts  (e.g.,  numbers) and  theorems  (e.g.,
unique factorization),  plus a few new ones.   The main limitation of
this approach  is that as AM discovers new  concepts it's NOT able to
synthesize new heuristic rules for dealing effectively with them. The
research proposed  here is  to assemble  a sufficient  collection  of
META-heuristics  (rules which modify  and create rules),  add them to
AM,  and then carefully  test their  generality by having  AM work in
several fields of mathematics.

.end "aabstractpag"
.skip to column 1; chap(PREVIEW)

Most scientific  research -- and surprisingly  many everyday tasks --
involve complex decision-making based on incomplete information. Such
"%binductive  inference%a"  processes   have  been  studied  both  by
Psychology  and by  Artificial Intelligence (AI).   Several recent AI
attempts  to mechanically  perform scientific  inference [1,2,3] have
been based  on  the following  methodology:  Represent  the  research
activity as a %bsearch%a through some space, then elicit from experts
a large  collection of rules of thumb they  employ to constrain their
search.    Thus  Meta-Dendral   [1]  contains  rules   used  by  mass
spectroscopists, MYCIN [2]  embodies rules contributed by physicians,
AM [3] uses hundreds of heuristics gleaned from mathematicians, etc.

But  one  key  element  of  the  human  scientist  is  lacking:   his
adaptability.  He  can discover new rules,  modify his existing ones,
find  better  representations:   in short,  he  can  do  considerable
processing at a higher or "%bmeta-%a" level.

One conceptually elegant hypothesis is  that the scientist performs
such  feats by (subconsciously) following  a collection of meta-rules
of thumb. The psychologist  likes this explanation because he needn't
radically change his language [4]; the AI researcher likes it because
he  needn't radically change his  program.  He can try  to elicit the
meta-rules  from the  scientist  and  then  simply add  them  to  his
program's existing base of rules.

The proposed research is a  step in this direction. Here is the basic
argument:

.begin; preface 0; indent 4,7,5;

1. A computer program,  "AM", successfully uses hundreds of heuristic
rules to  guide it in defining new  mathematics concepts and noticing
relationships between them.

2. The  main limitation of AM is that,  as the newly-defined concepts
grow  further  and  further  away  from  the  initial  ones,  no  new
heuristics  are  created  for  dealing  effectively  with  those  new
concepts.

3.  The  same  technique  which led  to  the  collection of  the  250
heuristics  AM possesses  can  be  used to  collect  meta-heuristics:
rules of  thumb for modifying and  discovering heuristic rules.  This
extraction process has already begun.

4. The  meta-rules  are very  general. They  deal  not so  much  with
mathematics  as with  "rules".   This  generality can  be  tested  by
observing AM explore varied fields of mathematics.

5. If a body of general, domain-independent meta-rules were codified,
it  would  be  of interest  to  psychologists  trying  to  understand
scientific inference, and would be of use to AI researchers trying to
mechanize such activities.

.end

The proposal begins with a section on inductive inference in general,
scientific inference, and finally  the role of plausible reasoning in
mathematics. The next section  deals with the AM program, after first
reviewing the relevant work  by other researchers. Section 4 contains
a summary of the results to date obtained from AM. They point clearly
to the continuation proposed herein. The next section
discusses the significance of this work, its feasibility, and its
budget. Finally, several appendices are present to offer
supplemental materials: how concepts are represented, some sample
heuristic rules, a trace of an actual AM run, a "prettified" condensed
trace, some speculations about long-range impacts of this work, the
vitae and publication lists of the co-principal investigators, and
the numbered list of references cited within this proposal.
.chap(|INFERENCE|);

This section provides background  material on the relevant aspects of
inductive inference,  scientific  reasoning, and  mathematics  theory
formation. The  last of  these forms  the task  domain in  which  the
proposed research will be carried out.

.sec(Inductive Inference)

Many  everyday tasks  which we  refer to  as requiring "intelligence"
involve  the   making  of  decisions  in   the  absence  of  complete
information:   While driving  to work in  the morning,  what route is
best at that particular hour?  What did that announcer say? Is that a
police car behind me?

In each case, how is a plausible solution obtained? Each of us
 has, over the
years,  built up a large  collection of more or  less general rules of
thumb.  A  typical rule  might be  "%bAfter 8am,  the expressway  gets
crowded%a".  One  then applies these rules  to the current situation.
Although  each rule is quite  minuscule in scope, their  union 
suffices to cover most common situations.

Those who have studied such phenomena have frequently selected
quite restricted inductive activities for their subjects. Perhaps
the simplest inductive inference task is that of %bsequence
extrapolation%a [5]. One is given the opening few terms of a sequence,
and asked to guess what the next term is:

1 1 8 1 27 1 64 1 125 1 uu-   ?? ]

The informal rules for this task include 
the concept of splitting the sequence into two or more subsequences
(as in this case, every second term is "1"), the notion of 
successive differences (thereby yielding a new sequence which may be
easier to extrapolate), and finally the notion of repeating and
composing all these preceding techniques until the sequence is reduced
to one that is recognized by inspection (such distinguished
sequences include: constant ones,
the integers in order, their
squares, their cubes, the prime numbers, the Fibonacci sequence, etc.).

Using just such a simple model, it is quite easy to build a computer
program that out-performs humans at this task [6]. Tasks which
draw upon a much larger data base (e.g., cryptograms) cannot be
so easily mechanized.
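
For concreteness, here is a minimal sketch of such an extrapolator.
The LISP below is purely illustrative (the function names and depth
bound are invented; this is not the program of [6], and for brevity
only constant sequences are "recognized by inspection"), but it
embodies the three informal rules above, and it does predict 216 for
the example sequence:

.begin nofill; preface 0; indent 0;
(defun constant-seq-p (seq)                ; recognizable by inspection
  (and (>= (length seq) 2)
       (every (lambda (x) (= x (first seq))) seq)))

(defun differences (seq)                   ; successive differences
  (mapcar #'- (rest seq) seq))

(defun every-other (seq)                   ; one interleaved subsequence
  (when seq (cons (first seq) (every-other (cddr seq)))))

(defun extrapolate (seq &optional (depth 4))
  "Guess the next term of SEQ, or NIL if no rule applies."
  (cond ((constant-seq-p seq) (first seq))
        ((or (zerop depth) (< (length seq) 3)) nil)
        ;; next term = last term + extrapolated next difference
        ((let ((d (extrapolate (differences seq) (1- depth))))
           (and d (+ (car (last seq)) d))))
        ;; else split into two interleaved subsequences; the parent's
        ;; next term is the next term of the one whose turn comes next
        (t (extrapolate (if (evenp (length seq))
                            (every-other seq)
                            (every-other (rest seq)))
                        (1- depth)))))

;; (extrapolate '(1 1 8 1 27 1 64 1 125 1))  =>  216
.end;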

A full step more sophisticated than sequence extrapolation
is the task of %bconcept formation%a [7]. In a typical experiment,
the psychologist asks a subject to learn to discriminate when
a stimulus is and isn't an exemplar of the concept to be mastered.
Again, simple models exist and lead to concise, effective computer
programs for the task [8,20].
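
As an illustration of how simple such a model can be (the sketch is
ours, not the model of [8] or [20]): let each stimulus be a set of
features, and let the hypothesis be whatever features are common to
all positive exemplars seen so far.

.begin nofill; preface 0; indent 0;
(defun learn-concept (positive-stimuli)     ; each stimulus is a list
  (reduce #'intersection positive-stimuli)) ; like (LARGE RED SQUARE)

(defun exemplar-p (hypothesis stimulus)     ; discriminate: is this an
  (subsetp hypothesis stimulus))            ; instance of the concept?

;; (learn-concept '((large red square) (large red circle)))
;;    =>  (LARGE RED), in some order
;; (exemplar-p '(large red) '(large red triangle))  =>  T
;; (exemplar-p '(large red) '(small red circle))    =>  NIL
.end;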

This classificatory activity historically precedes a more
comparative and eventually a metric kind of concept formation.
Ultimately, one crosses the fuzzy boundary and begins to do what
is called %btheory formation%a [9]. 
But even at this sophisticated level, the same simple explanation
suffices: one applies his rules of thumb to the current situation.

Artificial Intelligence work has demonstrated -- often to the
dismay of the researcher -- that many apparently deductive
tasks actually demand a large amount of inductive reasoning.
Foreign language translation by machine seemed quite
within reach -- until the first such programs were written.
The same may be said about such "deductive" activities as
proving a theorem, predicting the weather, and identifying
a molecule based on its mass spectrogram. The whole recent
emphasis on Frames [10] is merely the realization that much of
our everyday life is spent in forming simple theories
about our environment (Based partly on limited sense data
and based heavily on past experiences, we have a tentative
model of the room we're in, the state of mind of our
companions, the immediate future, etc.). So inductive
inference permeates our lives, at all levels.


.sec(Scientific Inference)

Nowhere is the use of inductive reasoning so explicit as in
the process of scientific research. The scientific method 
reads like a recipe for induction: constrain attention to a
manageable domain, gather data, perceive regularity in it,
formulate hypotheses, conduct experiments to test them,
and then use their results as the new data with which to
 repeat this cycle again.

The preceding discussion suggests that a good
task domain in which to investigate inductive thinking
is Science itself. Thus, one expects to find psychological studies of
scientists %bin vivo%a, and AI
programs which carry out simple
kinds of scientific research.
Although the former have been sparse, there have been a few recent
attempts at the latter.

The first notable AI program which attempted to mechanize a
scientific activity was "Heuristic DENDRAL" [11].
The program was fed mass spectral data for an unknown
organic compound; it generated hypotheses for fragments
of the compound, then tested them. Heuristics were
used to constrain the generation and to speed up the
test. One may view Dendral as an embodiment of the
rules of thumb used by mass spectroscopists to
produce identifications. 

Other programs have arisen to tackle other scientific
domains. Here we shall merely point to the identification
of protein structure from X-ray crystallographic data [12], the
diagnosis of infectious disease using physicians'
rules explicitly [2], and the automatic formation of new
theories of mass spectroscopy using the input/output of
Dendral as data to induct upon [1].

There has been a gradual realization that the scientist's
rules of thumb should be elicited explicitly. With this
has come the discovery that one's conscious rules are not
sufficient to account for creative scientific behavior.
There must be some meta-level processing [2,4] going on,
some unconscious reliance on rules which modify and create rules,
which modify and create alternate representations. This is the
problem which we propose to attack.

We believe that Mathematics is a good choice of task domain
in which to study inductive inference. As Poincare' [13] says,

.once indent 5,5,5; select b; preface 0;

The genesis of mathematical creation is a problem which should intensely
interest the psychologist. It is the activity in which the
human mind seems to take least from the outside world,
in which it acts or seems to act only of itself and on itself, so that
in studying the procedure of mathematical thought we may hope to
reach what is most essential in man's mind.

 There are several more concrete
reasons supporting this:

.begin; indent 0,4,0; preface 0; spacing 0; once preface 1;

1. In doing  math research, one  needn't cope with the  uncertainties
and  fallibility  of  testing   equipment;  that  is,  there  are  no
uncertainties in the data (compared to, e.g., molecular structure inference
from mass spectrograms).

2. Reliance on  experts' introspections is  one of the  most powerful
techniques  for  codifying the  judgmental criteria  necessary  to do
effective work in a field;  we personally have had enough training  in
elementary mathematics  so that we didn't have to rely  completely on
external  sources for  guidance in formulating  such heuristic rules.
Also,  several  excellent  sources  are  available   [13,14,15,21,22,25].

3. The more formal a science is, the  easier it is to automate. For a machine to
carry out research in  psychology would require  more knowledge  about
human information  processing than now  is known,  because psychology
deals with entities as complex as you and I. Also, in a formal science,
the %blanguages%*  to  communicate  information can  be  simple  even
though the %bmessages%* themselves may be sophisticated.

4.  Since mathematics  can deal  with any  conceivable constructs,  a
researcher there is not limited to explaining observed data.  Related
to this is the freedom to investigate -- or to give up on -- whatever
the researcher  wants to. There is  no single discovery  which is the
"goal", no given problem to solve, no right or wrong behavior.

5. Unlike "simpler" fields, such as propositional logic, there is  an
abundance of heuristic rules available for the picking.

.END

The limitations of math as a domain  are closely intertwined with its
advantages.   Having no  ties to real-world  data can be  viewed as a
limitation, as can having no  clear goal. There is always the  danger
that  AM will  give up  on each  theory as  soon as  the first  tough
obstacle crops up.

Since  math has been worked  on for millennia by  some of the greatest
minds from  many  different cultures,  it is  unlikely  that a  small
effort like AM would make any new inroads or have any startling
insights.  In that respect, Dendral's space was much less explored.
Of course
math -- even at the elementary level that AM explored it -- still has
undiscovered gems (e.g., the recent unearthing of Conway's numbers [21]).

Having chosen Mathematics as our task domain, let us examine
it more closely. This will be done in the next section.


.sec(Plausible Reasoning in Mathematics)

The inadequacy of formal deductive methods in mathematics
has long been argued [13,14,15], often lamented in the
conclusions to resolution-theorem-proving articles [16,17].
An early use of analogic models in geometry-theorem proving
was quite successful [18]. Later attempts at the introduction
of heuristics into resolution theorem-provers include [16,19,22].
It is hypothesized that the limited successes of these schemes
was due to the relatively small number of -- and variety of --
heuristics they possessed, and the fact that resolution was the dominant
driving force in the program.
The reason for the small number of heuristics is simply the
genuine paucity of informal rules of thumb which exist for the
very general environment in which those programs
operate: domain-independent (asemantic) predicate calculus
theorem-proving.

Recently, Lenat has completed a thesis [3] about the mechanization
of the "other half" of mathematical activity, apart  from proving: namely,
defining new concepts and noticing plausible conjectures. 
While his "AM" system had no proof capabilities,
the important point was how it viewed Mathematics. Below is the
model of math research that AM was based upon. It has been
pieced together out of the writings of Poincare', Polya,
Lakatos, Hadamard, and others:

.begin; spacing 0; preface 1; indent 0,3,0; group; once center; select b; preface 1;
uu-MODEL  OF  MATH  RESEARCH]


1. The order in which a math textbook presents a theory is almost the
exact opposite  of the order in which  it was actually discovered and
developed.  In a text, new  definitions are stated with little or  no
motivation, and they turn out to be just the ones needed to state the
next big  theorem, whose proof then magically appears. In contrast, a
mathematician  doing   research  will   examine  some   already-known
concepts, perhaps trying to find some regularity in experimental data
involving them. The patterns he  notices are the conjectures he  must
investigate further, and these relationships directly motivate him to
make new definitions.

.apart;

2.  Each step  the  researcher takes  while developing  a  new theory
involves choosing from  a large set of  "legal" alternatives --  that
is,  searching.   The  key to  keeping this  from  becoming a  blind,
explosive  search  is the  proper  use of  evaluation  criteria. Each
mathematician uses his own  personal heuristics to choose  the "best"
alternative available at each moment.

3.   Non-formal   criteria   (aesthetic  interestingness,   inductive
inference from  empirical evidence,  analogy, and  utility) are  much
more  important   than   formal  deductive   methods  in   developing
mathematically   worthwhile   theories,   and   in  avoiding   barren
diversions.

4. Progress in %bany%*  field of mathematics demands much  non-formal
heuristic  expertise  in  %bmany%*  different  "nearby"  mathematical
fields.  So a  broad, universal  core of  knowledge must  be mastered
before any single theory can meaningfully be developed.

5. It is sufficient (and  pragmatically necessary) to have and  use a
large  set  of  informal  heuristic  rules. These  rules  direct  the
researcher's next activities, depending  on the current situation  he
is  in.   These rules  can  be assumed  to  superimpose ideally:  the
combined  effect of several rules  is just the sum  of the individual
effects.

6. The  necessary  heuristic rules  are  virtually  the same  in  all
branches of mathematics,  and at all levels of  sophistication.  Each
specialized  field will  have some of  its own  heuristics; those are
normally much more powerful than the general-purpose heuristics.

7. For true understanding, the researcher should  grasp$$ 
Have access
to,  relate to,  store,  be able  to  manipulate, be  able to  answer
questions about  $  each  concept  in  several  ways:  declaratively,
abstractly, operationally,  knowing  when it  is relevant,  and as  a
bunch of examples.

8. Common metaphysical  assumptions about nature and  science: Nature
is   fair,  uniform,  and   regular.     Coincidences  have  meaning.
Statistical considerations  are valid  when looking  at  mathematical
data.   Simplicity and  symmetry and  synergy are  the rule,  not the
exception.

.end; skip 2;

Let us try to motivate some of these assumptions, by elaborating
on how (in our view) mathematical discoveries are made.




.sec(Discovery in Mathematics);

Before  discussing  how  to %bsynthesize%*  a  new mathematical  theory,  consider
briefly how to %banalyze%* one, how to construct a plausible chain of
reasoning which stretches from a given discovery all the way
back to well-known concepts.

.subsec(Analysis of a Discovery);

One can rationalize a given discovery by
working  backwards,  by reducing  the  creative  act to  simpler  and
simpler  creative acts.   For example, consider the  concept of prime
numbers.  How might  one be led to  define such a notion,
if he'd never heard of it before?  Notice the
following plausible strategy:

.ONCE spacing 0;  INDENT 9,9,9 SELECT B;

"If f is  a function which transforms elements of  A into elements of
B, and B is ordered, then consider just those members of A which  are
transformed  into  uu-extremal]  elements of  B.    This  set  is  an
interesting subset of A. Name it and study it."

When  f(x) means "divisors  of x", and  the ordering  is "by length",
this heuristic says to consider those numbers which have  a minimal
number of factors
-- that is,  the primes.  So this rule  actually %breduces%* our task
from "proposing the concept of prime numbers" to two more  elementary
problems: "discovering   ordering-by-length"   and   "inventing
divisors-of".

But  suppose we  know this  general rule: %b"If  f is  an interesting
function, consider its inverse."%* It reduces the task of discovering
divisors-of to the simpler  task of discovering multiplication.
Eventually, this task reduces to the discovery of very basic notions,
like substitution,  set-union, and equality.  To  explain how a given
researcher might have  made a  given discovery, such  an analysis  
must be
continued  until that  inductive  task  is reduced  to  "discovering"
notions which the  researcher already knew, which were his conceptual
primitives.
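
This heuristic is concrete enough to run. The sketch below is
illustrative LISP, not AM's actual rules (for those, see Appendix
II); with f as divisors-of and the ordering by length, the extremal
preimage of the numbers 2 through 30 is exactly the primes:

.begin nofill; preface 0; indent 0;
(defun divisors-of (x)
  (loop for i from 1 to x
        when (zerop (mod x i)) collect i))

(defun extremal-preimage (f domain)
  "Members of DOMAIN whose image under F is of minimal length."
  (let* ((images (mapcar f domain))
         (least  (reduce #'min (mapcar #'length images))))
    (loop for x in domain
          for fx in images
          when (= (length fx) least) collect x)))

;; (extremal-preimage #'divisors-of (loop for n from 2 to 30 collect n))
;;    =>  (2 3 5 7 11 13 17 19 23 29)
.end;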

.subsec(Syntheses of Discoveries)


Suppose a  large collection  of these  heuristic strategies has  been
assembled (e.g.,  by analyzing a great  many discoveries, and writing
down new heuristic rules whenever necessary).  Instead of  using them
to %bexplain%* how  a given idea might have evolved,  one can imagine
starting from  a basic core of knowledge and "running" the heuristics
to %bgenerate%*  new concepts.    We're talking  about reversing  the
process  described  in  the  last  section: not  how  to  %bexplain%*
discoveries, but how to %bmake%* them.

Notice that this forward search is much "bushier", much more
explosive, than was the backwards analysis previously described.
So it's much harder to actually make a discovery than to
rationalize -- by hindsight -- how one might have made it.
We all have noticed this phenomenon, the "Why-didn't-I-think-of-that-sooner!!"
feeling.

The forward search is too explosive; we may hypothesize that
the scientist employs some kind of informal rules of thumb
to constrain it. That is, he doesn't really follow rules like
%b"Look at the inverse of each known function f"%a, because that would take
up too much time. Rather, his heuristic rules might be more
naturally stated as productions (condition/action rules) like
this: %b"If f is 1-1 and Range(f) << Domain(f),
Then look at f-inverse.%a"
Henceforth, uu-"%bheuristic rule%a"] will
mean a conditional rule of thumb. In any particular situation,
some subset of these rules will "trigger", and will suggest
a manageable space of plausible activities to perform.
After exploring that space for a while, the situation will
have changed, and the cycle will begin anew.
This double layer of heuristic search shouldn't bother anyone;
it is analogous to performing double induction instead of
standard mathematical induction.
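
In program form, then, each heuristic rule is a condition/action
pair. A minimal sketch (the record structure and slot names here are
invented for illustration, not AM's):

.begin nofill; preface 0; indent 0;
(defstruct heuristic
  name
  condition       ; predicate of the current situation
  action)         ; what to do, or suggest, when the condition holds

(defun triggered-heuristics (rules situation)
  "The subset of RULES whose conditions hold in SITUATION."
  (remove-if-not
    (lambda (r) (funcall (heuristic-condition r) situation))
    rules))

;; The rule quoted above, recast in this form:
(defparameter *look-at-f-inverse*
  (make-heuristic
    :name      'look-at-f-inverse
    :condition (lambda (s) (and (getf s :f-is-1-1)
                                (getf s :range-inside-domain)))
    :action    (lambda (s) (list :consider (list 'inverse (getf s :f))))))
.end;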

.chap(|DESIGN OF THE uu-AM] PROGRAM|);

Such synthesis is precisely what AM -- and a scientist --  does.  The program consists of a
large corpus  of  primitive mathematical  concepts, each  with a  few
associated  heuristics. Each such heuristic is a situation/action  rule  which functions
  as
a local "plausible move generator".  Some suggest tasks for the system
to carry out,  some suggest ways of satisfying a  given task, etc. 
AM's  activities all  serve to  expand AM  itself, to enlarge  upon a
given body of mathematical knowledge.
 AM uses  its  heuristics as
judgmental criteria  to  guide  development  in  the  most  promising
direction.

.sec(Representation);

.chgfont("ngr20",b);

Each concept  is represented as  a frame-like data structure  with 25
different facets  or slots.  The types of facets include: %bEXAMPLES,
DEFINITIONS, GENERALIZATIONS, DOMAIN/RANGE, ANALOGIES, INTERESTINGNESS,%*
  and  many  others.    Modular  representation  of
concepts provides a convenient scheme for organizing the  heuristics;
for example, the following strategy fits  into the %bEXAMPLES%* facet
of  the %bPREDICATE%* concept: 

.chgfont(B!FONT,b);
.once indent 5,5,5; preface 1; spacing 0;

 %b"If, empirically, 10  times as many
elements uu-fail]  some predicate  P, as  uu-satisfy]  it, then  some
uu-generalization] (weakened version) of  P might be more interesting
than  P"%a.  

.chgfont("ngr20",b);
.once preface 1;

 AM considers  this suggestion  after trying to  fill in
examples of  each  predicate.  In  fact, after  AM  attempts to  find
examples  of  %bSET-EQUALITY%a,  so few  are  found  that  AM decides  to
generalize that predicate.  The result is the creation of 
several new predicates, one of which happens to mean
"Has-the-same-length-as" -- i.e., a rudimentary  precursor to natural
numbers.  

.chgfont(B!FONT,b);

Below is a typical concept, PRIMES, in a state long after AM defined
and explored it. Appendix I contains examples of another concept,
COMPOSE, both in this Anglicized form and in the actual LISP coded
form.

.begin nofill; preface 0; indent 3; retain; 
.at "MBOX" stuff "$" ⊂
stuff
.⊃

.group;
MBOX NAME: Prime Numbers $
MBOX DEFINITIONS: $
MBOX 		ORIGIN: Number-of-divisors-of(x) = 2 $
MBOX 		PREDICATE-CALCULUS: Prime(x) ≡ (∀z)(z|x α→  z=1 α⊗ z=x) $
MBOX 		ITERATIVE: (for x>1): For i from 2 to Sqrt(x), ¬(i|x) $
MBOX $
MBOX EXAMPLES: 2, 3, 5, 7, 11, 13, 17 $
MBOX 		BOUNDARY: 2, 3 $
MBOX 		BOUNDARY-FAILURES: 0, 1 $
MBOX 		FAILURES: 12 $
MBOX $
MBOX GENERALIZATIONS: Nos., Nos. with an even no. of divisors, Nos. with a prime no. of divisors $
MBOX $
MBOX SPECIALIZATIONS: Odd Primes, Prime Pairs, Prime Uniquely-addables $
MBOX $
MBOX CONJECS: Unique factorization, Goldbach's conjecture, Extremes of Number-of-divisors-of $
MBOX $
MBOX INTU'S: %bA metaphor to the effect that Primes are the building blocks of all numbers%* $
MBOX $
MBOX ANALOGIES: $
MBOX 		Maximally-divisible numbers are converse extremes of Number-of-divisors-of $
MBOX 		Factor a non-simple group into simple groups $
MBOX $
MBOX INTEREST: Conjectures tying Primes to TIMES, to Divisors-of, to closely related operations $
MBOX $
MBOX WORTH: 800 $
MBOX $
.apart; end; skip 2;

"Creating a  new  concept" is  a well-defined  activity: it  involves
setting  up a new  data structure like the one above,  and filling in
entries for some  of its facets  or slots.   Filling in a  particular
facet  of a  particular concept  is also  quite well-defined,  and is
accomplished  by executing a collection  of relevant heuristic rules.
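
A sketch may make this definiteness plain. Representing a concept as
a name plus a property list of facets (a drastic simplification of
AM's 25-facet frames; Appendix I shows the real representation), both
activities take only a few lines:

.begin nofill; preface 0; indent 0;
(defun make-concept (name &rest facets)     ; "creating a new concept"
  (cons name facets))

(defun facet (concept facet-name)
  (getf (rest concept) facet-name))

(defun fill-in-facet (concept facet-name new-entries)
  "Append NEW-ENTRIES to one facet of CONCEPT."
  (setf (getf (rest concept) facet-name)
        (append (facet concept facet-name) new-entries))
  concept)

(defparameter *primes*
  (make-concept 'primes
                :definitions '("Number-of-divisors-of(x) = 2")
                :examples    '(2 3 5 7 11 13 17)
                :worth       800))

;; (fill-in-facet *primes* :examples '(19 23))
;; (facet *primes* :examples)  =>  (2 3 5 7 11 13 17 19 23)
.end;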

.sec(Control);

.chgfont(B!FONT,b);


AM is initially  given a collection of 115 core concepts, with only a
few facets filled in for each.   Its sole activity is to choose  some
facet  of some  concept, and  fill in  that particular  slot.   In so
doing,  new  notions  will  often  emerge.    Uninteresting ones  are
forgotten, mildly interesting ones are kept as parts of  one facet of
one   concept,   and  very   interesting   ones   are  granted   full
concept-module status. Each of these new modules has dozens of  blank
slots, hence the space of possible actions  (blank facets to fill in)
grows  rapidly.   The same  heuristics are used  both to  suggest new
directions for investigation, and to limit attention: both  to sprout
and to prune.

.<<chgfont("ngr20",b);>>


The fundamental kind of task which AM performs, its basic action,
is to fill in a given facet of a given concept. To decide which
such task  to work on next, AM maintains an ii-agenda] of tasks,
a global job-list ordered by 
priority.    A typical task is
%b"Fill-in examples of  Primes"%*. The agenda may contain hundreds of
entries such as  this one.   AM  repeatedly selects the top task from  the
agenda  and  tries to  carry  it  out.   This  is  the whole  control
structure! Of  course, we must still explain how AM creates plausible
new tasks to place on  the agenda, how AM decides which task  will be
the best one to execute next, and how it carries out a task.
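
The skeleton of that control structure is sketched below (the task
record and its numeric rating are stand-ins, invented only to make
the loop concrete):

.begin nofill; preface 0; indent 0;
(defstruct task
  description       ; e.g. "Fill in examples of Primes"
  reasons           ; list of supporting reasons
  (rating 0))       ; combined worth of those reasons

(defun am-cycle (agenda execute-fn)
  "Repeatedly select the best-supported task and carry it out.
EXECUTE-FN satisfies one task and returns any newly proposed tasks."
  (loop while agenda do
    (let ((best (first (sort (copy-list agenda) #'> :key #'task-rating))))
      (setf agenda (remove best agenda))
      ;; carrying out a task fills in facets, may create concepts,
      ;; and may propose new tasks, which join the agenda
      (setf agenda (append (funcall execute-fn best) agenda)))))
.end;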

If the task is %b"Fill in new Algorithms for Set-union"%*, then 
%bsatisfying%* it would mean actually synthesizing some new procedures,
some new LISP code capable of forming the union of any two sets.
A heuristic rule is %brelevant%* to a task iff executing that rule brings
AM closer to satisfying that task. 
Relevance is determined a priori by where the rule
is stored. A rule tacked onto the Domain/range facet of the Compose
concept would be presumed relevant to the task %b"Check the Domain/range
of Insert-o-Delete"%a.

Once  a task is  chosen from  the agenda,  AM gathers  some heuristic
rules which might  be relevant  to satisfying  that task.   
They  are
executed, and then AM picks  a new task.  While a  rule is executing,
three kinds of actions or effects can occur:

.begin; indent 4,4,0; spacing 0; preface 1; once  indent 0,4,0;

(i)  Facets of  some concepts  can get filled  in (e.g.,  examples of
primes may actually be found and tacked onto the "Examples"  facet of
the "Primes"  concept).   A typical heuristic  rule which  might have
this effect is:

.B816

To  fill in  examples of X,  where X  is a kind  of Y  (for some more
general concept Y),

Check the examples of Y; some of them may be examples of X as well.

.end;

For the task of filling  in examples of Primes, this rule  would have
AM notice  that Primes is a  kind of Number, and  therefore look over
all the known examples of Number. Some of those would be primes, and
would be transferred to the Examples facet of Primes.

.once indent 0,4,0;

(ii) New concepts may be created (e.g., the concept "primes which are
uniquely representable as the sum of two other primes" may be somehow
be  deemed worth  studying).   A typical  heuristic rule  which might
result in this new concept is:

.B816

If some (but not most) examples of X are also examples of Y (for some
concept Y),

Create a new concept defined as  the intersection of those 2 concepts
(X and Y).

.end;

Suppose AM has already isolated the concept of being representable as
the sum of two primes in only one way (AM actually calls such numbers
"Uniquely-prime-addable numbers").   When AM notices that some primes
are in this  set, the  above rule will  create a  brand new  concept,
defined as the set of numbers which are both prime and uniquely prime
addable.

.once indent 0,4,0;

(iii)  New  tasks may  be  added  to the  agenda  (e.g.,  the current
activity may suggest  that the following  task is worth  considering:
"Generalize the concept of prime numbers").  A typical heuristic rule
which might have this effect is:

.B816

If very few examples of X are found,

Then  add the following  task to the agenda:  "Generalize the concept
X".

.end;

Of course, AM contains a  precise meaning for the phrase  "very few".
When  AM looks for  primes among  examples of already-known  kinds of
numbers, it will find dozens of  non-examples for every example of  a
prime it uncovers.  "Very few" is thus naturally implemented as a
statistical confidence level, sketched just after this list.

.END
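
For concreteness, "very few" might be implemented as below. The
sample-size floor is invented for illustration; the 10:1 ratio echoes
the heuristic quoted earlier for the PREDICATE concept:

.begin nofill; preface 0; indent 0;
(defun very-few-p (successes trials)
  (and (>= trials 30)                   ; enough evidence to judge at all
       (< (/ successes trials) 1/10))) ; fewer than 1 in 10 succeeded

;; If only 2 of 50 candidate numbers turn out to be examples of X:
;; (very-few-p 2 50)  =>  T , so propose the task "Generalize X".
.end;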

The concept of  an agenda is certainly not new:  schedulers have been
around  for a long  time.  But  one important feature  of AM's agenda
scheme %bis%*  a  new idea:  attaching  -- and  using  -- a  list  of
quasi-symbolic 
 reasons  to each  task which explain  why the task  is worth
considering, why it's  plausible.   uu-It is the  responsibility of  the
heuristic rules to include  reasons for any tasks they  propose]. 
 For
example,  let's  reconsider  the heuristic  rule  mentioned  in (iii)
above.  It really looks more like the following:

.chgfont(B!FONT,b); B816

If very few examples of X are found,

Then add the  following task to the  agenda: "Generalize the  concept
X", for  the following reason: "X's  are quite rare; a  slightly less
restrictive concept might be more interesting".

.end;


If the same task is proposed by several rules, then several different
reasons for it  may be  present.  In  addition, one ephemeral  reason
also exists: "Focus of attention". Any tasks which are similar to the
one last executed  get "Focus of  attention" as a  bonus reason.   AM
uses all these  reasons, e.g.   to decide  how to rank  the tasks on  the
agenda.   The  "intelligence" AM  exhibits  is not  so much  "what it
does", but rather the %border%* in which it arranges its agenda.$$ For
example, alternating a randomly-chosen task  and the "best" task (the
one AM  chose to do) only slows the system down by a factor of 2, yet
it totally destroys its  credibility as a rational researcher
(as judged by the human user of AM).   This
is   one   conclusion   of an  experiment performed upon AM
recently. $
AM uses  the  list  of
reasons in another way: Once a task has been selected, the quality of
the reasons  is used to decide how much time  and space the task will
be permitted to absorb, before AM  quits and moves on to a new  task.
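
In sketch form (reusing the illustrative task record above, with
schematic worth numbers), proposing a task either creates a new
agenda entry or strengthens an existing one:

.begin nofill; preface 0; indent 0;
(defun propose-task (agenda description reason worth)
  "Add a task to AGENDA, or add another supporting reason to it."
  (let ((old (find description agenda
                   :key #'task-description :test #'equal)))
    (cond (old (pushnew reason (task-reasons old) :test #'equal)
               (incf (task-rating old) worth)  ; more reasons, higher rank
               agenda)
          (t (cons (make-task :description description
                              :reasons (list reason)
                              :rating  worth)
                   agenda)))))

;; (propose-task agenda "Generalize the concept Set-Equality"
;;               "so few examples of Set-Equality were found" 200)
.end;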

.chgfont("ngr20",b);

Executing  a task  is achieved by locating  relevant rules  of thumb and
evaluating  them. The  location is performed  efficiently because all
the concepts are  related by generalization/specialization links, and
because  the following  key heritability property  holds:  Any method
for  filling in facet  f of concept C  will also work  for filling in
facet  f of any specialization  of C.
Thus when  the task  "Fill  in
examples  of ii-SET-EQUALITY]" is  chosen,
AM asks each generalization of ii-SET-EQUALITY]
for  help. Thus  it asks  for ways  to fill in examples of  any Predicate, any
Activity, any Concept, and finally for ways to fill in examples of Anything.
One such heuristic rule known to the Activity concept says: "Actually
execute the activity on some random members of its domain." Thus to
fill in examples of ii-SET-EQUALITY], its domain facet is inspected,
and AM sees it takes a pair of objects as its arguments. Then AM
accesses the Examples facet of the concept ii-OBJECT], where it finds a
large list of objects. Obeying the heuristic rule, AM repeatedly picks
a pair of them at random, and sees if they satisfy ii-SET-EQUALITY]
(by actually running the LISP function stored in the Algorithms
facet of ii-SET-EQUALITY]).
While this will typically return False, it will occasionally locate
-- by random chance -- a pair of equal sets.
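
A sketch of that lookup, using symbol property lists (the single
chain of generalizations shown mirrors the example in the text,
though in AM a concept may have several generalizations):

.begin nofill; preface 0; indent 0;
(defun generalization-chain (concept)
  "CONCEPT followed by its generalizations, most specific first."
  (when concept
    (cons concept (generalization-chain (get concept 'generalization)))))

(defun rules-for (concept facet)
  "All heuristics for filling in FACET of CONCEPT; by heritability,
inherited rules apply too, the weaker and more general ones last."
  (loop for c in (generalization-chain concept)
        append (getf (get c 'heuristics) facet)))

;; Toy setup mirroring the text:
(setf (get 'set-equality 'generalization) 'predicate
      (get 'predicate    'generalization) 'activity
      (get 'activity     'generalization) 'any-concept
      (get 'activity     'heuristics)
      '(:examples ("run the activity on random members of its domain")))

;; (rules-for 'set-equality :examples)
;;    =>  ("run the activity on random members of its domain")
.end;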

Other heuristics, tacked onto other generalizations of %bSET-EQUALITY%a,
provide additional methods for executing the task "Fill in examples of
%bSET-EQUALITY%a".
A heuristic stored on the concept ii-ANY-CONCEPT] says
to symbolically instantiate the definition. A bag of tricks is provided
for this purpose, one of which 
("instantiate the base step of the recursion")
works nicely on the recursive definition
provided for ii-SET-EQUALITY].
Notice that, as one might expect, the more general the concept is, the
weaker (more time-consuming) its heuristics are.
For this reason, AM consults each concept's rules  in order of increasing
generalization.

The interested reviewer may wish to glance at Appendix II, which
contains a few of the 253 heuristic rules AM possessed.

.skip to column 1; chap(RESULTS);

As the last section  would indicate, the apparent dynamic behavior
of AM was as follows: a task is chosen from the agenda,
relevant heuristic rules are located and executed, and the
cycle repeats. During execution of the rules, three types of
events occur: blank facet(s) get filled in, new concept(s)
are defined, and new task(s) are added to the agenda. A modest
facility exists for having AM print out a description of these
activities as they occur. Section 4.1 presents an excerpt of
this self-trace  monologue. 
At a higher level, Section 4.2 describes the
overall paths followed by the program during its best single run.

.sec(An Excerpt)

.ONCE TURN ON "{}"

The purpose of  the example which comprises the last half of this section  is to
convey  a bit of  AM's flavor. After  reading through  it, the reader
should be convinced that  AM is %bnot%*  a theorem-prover, nor is  it
%brandomly%*  manipulating entries  in a  knowledge base,  nor is  it
%bexhaustively%* manipulating  or searching.  AM is carefully growing a network
of data structures representing mathematical concepts,  by repeatedly
using heuristics both (a) for guidance in choosing a task to work on next, and
(b) to provide methods to satisfy the chosen task.


The following points are important  but can't be conveyed by any lone
example:

.begin; indent 4,4,0; once indent 0,4,0;

(i)  Although  AM  appears   to  have  reasonable  natural   language
abilities, this  is a  typical AI  illusion: most  of the phrases  AM
types  are mere tokens,  and the syntax  which the user  must obey is
unnaturally constrained. For the sake of clarity, I have "touched up"
some of  the wording, indentation,  syntax, etc. of what  AM actually
outputs,  but left the spirit  of each phrase intact.   If the reader
wishes,  he may  glance at  Appendix III,
which shows some actual listings of AM in action. 

.ONCE indent 0,4,0;

(ii)  The reader  should  be skeptical
of  the generality  of  the
program; is  the knowledge base  "just right" (i.e., finely  tuned to
elicit  this one chain of  behaviors)?  The  answer is "%bNo%*":
The whole point of this project was
to show that a relatively small set of general heuristics can guide a
nontrivial discovery process.  Each activity, each task, was proposed
by some heuristic rule (like "look for extreme cases of X") which was
used time and time again, in many situations.  It was  not considered
fair  to insert  heuristic guidance  which  could only  "guide" in  a
single situation.

.ONCE PREFACE 1

This  kind of generality can't be  shown convincingly in one example.
The reviewer is pointed to Chapter 2 of [3] for a much longer excerpt
of AM in action. There he may see that the same line of development which  leads to decomposing numbers  (using TIMES-inverse) and
thereby discovering unique factorization,  also leads to  decomposing
numbers (using ADD-inverse) and thereby discovering Goldbach's conjecture.

.END; APART;


Let me  reemphasize that the "point"  of this example is  %bnot%* the
specific   mathematical  concepts,  nor   the  particular  chains  of
plausible reasoning AM  produces, nor the  few flashy conjectures  AM
spouts,  but rather  an illustration  of the  %bkinds%* of  things AM
does.


Recall that  in general, each  task on the  agenda will  have several
reasons attached to it.  In the example excerpt, the reasons for each
task are  printed just  after the  task is  chosen,  and before  it's
executed.

AM numbers its activities sequentially. Each time  a new task is chosen, a counter
is  incremented. The  first task in  the example  excerpt is labelled
TASK 65, meaning that the example skips the first  64 tasks which
AM selects and carries out.
The reason is simply that the development of simple concepts related to
divisibility will probably be more intelligible and palatable to the reader
than AM's early ramblings in finite set theory.


.ONCE TURN OFF "{}"


The  concepts that  AM talks  about  are self-explanatory  -- by  and
large.
"NUMBER"   will mean (typically) a natural number (a nonnegative
integer).
"DIVISORS-OF (x)", for any  number x,
 is the set of all positive numbers which divide evenly
into x.  For example, Divisors-of(18) = {1 2 3 6 9 18}.



Now here is the excerpt itself:

.begin nofill indent 0; preface 0; chgfont(B!FONT,b);
.turn on "α\"; tabs 5,14;


** uu-TASK 65:] **   Fill in Examples of the concept "Divisors-of".

\3 Reasons:\(1) No known examples of Divisors-of so far.
\\(2) TIMES, which is related to Divisors-of, is now very interesting.
\\(3) Focus of attention: AM recently defined Divisors-of.

26 examples found, in 9.2 seconds. e.g., Divisors-of(6)=α{1 2 3 6α}.


** uu-TASK 66:] **  Consider numbers having small sets of Divisors-of.

\2 Reasons:\(1) Worthwhile to look for extreme cases.
\\(2) Focus of attention: AM recently worked on Divisors-of.

Filling in examples of numbers with 0 divisors.
   0 examples found, in 4.0 seconds.
   Conjecture: no numbers have precisely 0 divisors.

Filling in examples of numbers with 1 divisors.
   1 examples found, in 4.0 seconds. e.g., Divisors-of(1) = α{1α}.
   Conjecture: 1 is the only number with precisely 1 divisor.

Filling in examples of numbers with 2 divisors.
   24 examples found, in 4.0 seconds. e.g., Divisors-of(13) = α{1 13α}.
   No obvious conjecture. May merit more study.
   Creating a new concept: "Numbers-with-2-divisors".

Filling in examples of numbers with 3 divisors.
   11 examples found, in 4.0 seconds. e.g., Divisors-of(49) = α{1 7 49α}.
   All numbers with 3 divisors are also Squares. Definitely merits more study.
   Creating a new concept: "Numbers-with-3-divisors".


** uu-TASK 67:] **  Consider the square-roots of Numbers-with-3-divisors.

\2 Reasons:\(1) Numbers-with-3-divisors are unexpectedly also Perfect Squares.
\\(2) Focus of attention: AM recently worked on Nos-with-3-divisors.

   All square-roots of Numbers-with-3-divisors seem to be Numbers-with-2-divisors.
        e.g.,  Divisors-of(Square-root(169)) = Divisors-of(13) = α{1 13α}.
   Even the converse of this seems empirically to be true.
        i.e.,  the square of each No-with-2-divisors seems to be a No-with-3-divisors.
        The chance of coincidence is below acceptable limits.
   Boosting the interestingness rating of each of the concepts involved.


** uu-TASK 68:] **  Consider the squares of Numbers-with-3-divisors.

\3 Reasons:\(1) Squares of Numbers-with-2-divisors were interesting.
\\(2) Square-roots of Numbers-with-3-divisors were interesting.
\\(3) Focus of attention: AM recently worked on Nos-with-3-divisors.


.end

This last task goes nowhere, and is a good place to terminate the
excerpt and this section.


.sec(AM as a Mathematician)

Now that we've seen how AM works, and we've been exposed to some of
its "local" results, let's take a moment to discuss the
totality  of the  mathematics  which  AM carried  out.    Like a
contemporary  historian  summarizing  the  work  of  the   Babylonian
mathematicians, we shan't hesitate to use current terms and criticize
by current standards.

AM   began  its  investigations  with  scanty   knowledge  of  a  few
set-theoretic concepts.
Most  of the obvious  set-theory relations  (e.g., de  Morgan's laws)
were eventually uncovered; since  AM never fully understood  abstract
algebra, the statement  and verification of  each of these  was quite
obscure.    AM never  derived a  formal  notion of  infinity,  but it
naively established conjectures like "a set can never be a  member of
itself", and procedures for making chains  of new sets ("insert a set
into  itself").  No sophisticated  set theory (e.g., diagonalization)
was ever done.

After this initial period of exploration, AM  decided that "equality"
was   worth  generalizing,  and   thereby  discovered   the  relation
"same-size-as".  "Natural numbers" were based on this, and soon  most
simple arithmetic operations were defined.

Since addition arose as  an analog to union, and  multiplication as a
repeated  substitution,
it came  as
quite  a surprise  when AM  noticed that  they were  related (namely,
N+N = 2xN).  AM later re-discovered multiplication in three other ways:
as repeated addition, as the numeric analog  of the Cartesian product
of sets, and by studying the cardinality of power sets.  Squaring was
soon singled out, along with the inverse operation of square-root.  AM
remained content to play around with the concept of
ii-integer]-square-root.  Although it isolated the set of numbers
which had no square root, AM was never close to discovering
rationals, let alone irrationals. No notion of "closure" was provided
to -- or discovered by -- AM.

Raising to fourth-powers, and fourth-rooting, were discovered at this
time.  Perfect squares and perfect fourth-powers were isolated.  Many
other numeric operations and kinds of numbers were found
to be of interest:  Odds,
Evens, Doubling, Halving, etc.  Primitive notions of numeric
inequality were defined but AM never even discovered Trichotomy.


The  associativity and commutativity of multiplication indicated that
it could accept a  BAG of numbers as  its argument.  When  AM defined
the inverse  operation corresponding to Times,  this property allowed
the definition to be: "any bag  of numbers (>1) whose product  is
x".     This  was  just   the  notion  of   factoring  a   number  x.
Minimally-factorable  numbers turned out  to be what  we call primes.
Maximally-factorable numbers were also thought to be interesting.

Prime pairs were discovered in a bizarre way: by restricting
the domain and range of  addition to primes (i.e., solutions
of p+q=r in primes).


AM
conjectured   the   fundamental   theorem   of   arithmetic   (unique
factorization into  primes)  and Goldbach's  conjecture  (every  even
number >2 is the sum of two primes)  in a surprisingly symmetric way.
The unary representation of numbers gave way to a representation as a
bag of primes (based on  unique factorization), but AM never  thought
of  exponential  notation.   
Since  the  key concepts  of  remainder,
greater-than,  gcd, and exponentiation  were never mastered, progress
in number theory was arrested.

When a new base of ii-geometric] concepts was added, AM began finding
some more  general associations.  In place  of the strict definitions
for  the  equality  of  lines,   angles,  and  triangles,  came   new
definitions  of concepts  we  refer  to as  Parallel,  Equal-measure,
Similar,  Congruent, Translation, Rotation,  plus many  which have no
common name (e.g. the relationship of two triangles sharing  a common
angle).  A cute geometric interpretation of Goldbach's conjecture was
found  [Given   all  angles   of   a  prime   number   of  degrees,
(0,1,2,3,5,7,11,...,179 degrees),  then any angle  between 0 and  180
degrees can be approximated (to within 1 degree) as the sum of two of
those  angles.]     Lacking  a   geometry  "model"  (an   analogic
representation like  the one  Gelernter employed [18]),  AM was doomed  to
propose many implausible  geometric
conjectures.



.chgfont("BDR40",b);
.sec(|Limitations of uu-AM]|);

Although AM fared well according to several different measures
of performance (see Section 7.1 in [3]), of greatest significance
from the point of view of this proposal
are its %blimitations%a. This section will try to characterize 
some of them, and then Section 6 will prescribe some remedies.


As AM ran longer and longer, the concepts it defined were further
and further from the primitives it began with. Thus "prime-pairs"
were defined using "primes" and "addition", the former of which
was defined from "divisors-of", which in turn came from
"multiplication", which arose from "addition", which was defined
as a restriction of "union", which (finally!) was a primitive
supplied to AM initially. When AM subsequently needed help with
prime pairs, it was forced to rely on rules of thumb supplied
initially about %bunion%aing. Although the heritability 
property of heuristics did ensure that those rules were
still valid, the trouble was that they were too general, too
weak to deal effectively with the specialized notions of
primes and arithmetic.

As the derived concepts moved further away from finite set
theory, the efficacy of the initial heuristics decreased.
AM began to "thrash", appearing to lose most of its
heuristic guidance. It worked on concepts like "prime
triples", which may be reasonable to investigate if one knows
nothing about numbers, but which is obcene to one who does.
As another example, AM didn't realize that the unique factorization
of numbers into primes (UFT) was a much more significant conjecture
than Goldbach's anomaly.

The key deficiency was the lack of adequate %bmeta%a-rules [2,3,4]:
heuristics which cause the creation and modification of
new heuristics. Here is one such rule, which would
have taken care of the "Goldbach vs. UFT" problem:

.once select b; spacing 0; indent 5,5,5;

After applying the %a"look at the inverse of extrema"%b heuristic,
and thereby defining a new concept C (as f-inverse of b),
Synthesize a heuristic which indicates that conjectures involving
C and f (or f-inverse) are very significant and natural, whereas
those involving C and unrelated operations are probably anomalies.

How would this meta-rule be used? When primes are defined
as the inverse image of doubletons, under the operation "divisors-of",
the meta-rule would trigger, and a new rule would be synthesized.
That new heuristic would say that conjectures about primes were
natural iff they involved multiplication or division.
Thus the UFT would be rated as important, and Goldbach's
conjecture as cute but useless.
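
In the condition/action form sketched earlier, such a meta-rule is
itself executable: its action manufactures a brand new heuristic.
The sketch below is illustrative only (the slot names and rating
tokens are invented):

.begin nofill; preface 0; indent 0;
(defun synthesize-conjecture-rater (c f)
  "Meta-rule action: C has just been defined as the inverse image,
under F, of some extremal set.  Build a new heuristic for judging
conjectures about C."
  (make-heuristic
    :name      (list 'rate-conjectures-about c)
    :condition (lambda (s) (eq (getf s :conjecture-about) c))
    :action    (lambda (s)
                 (if (member (getf s :via-operation)
                             (list f (list 'inverse f))
                             :test #'equal)
                     :significant
                     :probable-anomaly))))

;; (synthesize-conjecture-rater 'primes 'divisors-of) yields a rule
;; rating a conjecture that ties Primes to Divisors-of :significant,
;; and one that ties Primes to Addition (Goldbach) :probable-anomaly.
.end;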


Aside from the preceding major limitation, most of the other
complaints are quite superficial. Many concepts one might
consider "primitive" are missing from AM: proof, tricks for
finding counterexamples, numbers, etc. Very few of them are
ever discovered by AM, and even those which are discovered
will not have any powerful heuristics filled in for them.


Analogies in general were under-utilized. Specifically,
analogies between heuristics were never even considered.
If one characterizes an analogy as a (partial) correspondence
between two collections of objects and operators, then it is
a small conceptual step to imagine heuristic rules which
look for and develop such mappings: the image of partial
matching comes immediately to mind. AM possessed a
few domain-independent such rules, and they managed to produce
some analogies (e.g., between multiplication and addition;
between sets/union/same-size and numbers/add/equality), but
the overall results were quite meager in this area.


Some notion of physical intuition should be present,
perhaps. One characterization is that of an abstract
representation for objects and operators; another is that
of archetypical examples. The first is useful for
guessing a plan of attack, the latter for inducing a
conjecture from just a single instance of it. But
these notions don't fully cover what mathematicians
colloquially mean by "intuition". More research must
be done in this area.


.sec(Conclusions about uu-AM]);


What has this project accomplished to date?

It is a living demonstration that a few hundred general heuristic rules
suffice to guide a searcher -- an automated math researcher --
as it explores and expands a large but incomplete base of math concepts.
That is, AM stands as a constructive existence proof that creative
theoretical research can be effectively modelled as heuristic search,
just as Meta-Dendral [1] establishes a similar hypothesis for 
the more constrained, real-world field of
mass spectroscopy.

AM introduces a control structure based upon an agenda of
small plausible research tasks, each with a list of
supporting reasons attached.
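
In skeleton form, that control structure might look as follows (a schematic
sketch only: the scoring function, the rule-gathering stub, and the task data
are our illustrations, not AM's actual code):

.begin nofill; select b; preface 0;

# Schematic sketch of an agenda-based control structure.
import heapq

def worth(reasons):
    # Here a task's plausibility is just the summed weight of its
    # supporting reasons; AM's actual rating formula is richer.
    return sum(weight for reason, weight in reasons)

agenda = []   # heapq is a min-heap, so worth is negated on entry

def add_task(description, reasons):
    heapq.heappush(agenda, (-worth(reasons), description, reasons))

def gather_relevant_heuristics(description):
    # Stub: in AM, rules ripple up from the concepts a task mentions.
    return []

add_task("Fill in examples of Primes",
         [("Primes was only just defined", 300),
          ("No examples of Primes are known yet", 200)])
add_task("Check examples of Sets",
         [("Some examples of Sets are quite old", 100)])

while agenda:
    neg, description, reasons = heapq.heappop(agenda)     # 1. pick best task
    for rule in gather_relevant_heuristics(description):  # 2. gather rules
        rule(description)                                 # 3. execute them

.end;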

The main successes were the few novel ideas it came up
 with$$ A new result in number theory, dealing with numbers
having very uu-many] divisors, is presented as Appendix 4 of [3]. $,
the ease with which new domains were discovered (e.g., number theory)
or introduced by hand (plane geometry), and the apparently rational
sequences of behavior AM exhibited.

.chap(|THE NEXT STEP:  THIS PROPOSAL|);

.sec(What Will be Done);

The AM  program gradually  began to  lose its  momentum, due  to  its
inability  to synthesize new heuristics  for dealing efficiently with
the new  concepts it created.  This proposal  is for the continuation
of the  AM project along these lines:  the introduction of meta-level
rules into  the program.  Their effect on AM's  behavior should be to
keep it from thrashing due  to lack of specific heuristics.  The need
for  meta-level  abilities  in  theory  formation  has  already  been
demonstrated by AM; the incorporation of such skills will probably
reveal some new deficiency in AM, one more obstacle on the path to
understanding inductive thinking.

Some effort will be expended to correct some of the other limitations
of AM mentioned earlier (in Section 4.3). A large number of
additional concepts will be present in AM's initial core,
including several dealing with Proof, Numbers, Infinity,
Continuity, and elementary concepts in quite varied fields.
Other concepts will give AM a handle on understanding and
modifying itself: a separate concept will describe each facet,
each category of reason that may support  tasks
on the agenda, each function
which is part of AM's control structure itself, each task
currently on the agenda, etc. The same knowledge which permits
AM to modify mathematical concepts should be capable of dealing
with these notions as well. Other efforts will be spent on
improving AM's use of analogy, physical intuition, and formal proof.
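
The intent can be sketched as follows (names purely illustrative); the point
is only that the pieces of AM's own machinery would sit in the very same
representation as its mathematical concepts:

.begin nofill; select b; preface 0;

# Every frame below is an ordinary concept, whether it denotes a piece
# of mathematics or a piece of AM itself; all entries are illustrative.
concepts = dict(
    Primes=dict(is_a="Math-Concept", examples=[2, 3, 5, 7]),
    Examples_Facet=dict(is_a="Facet", fills_slot="examples"),
    Select_Task=dict(is_a="Control-Function", part_of="agenda loop"),
    Task_17=dict(is_a="Agenda-Task", acts_on="Primes"),
)

def attach_note(concept_name, note):
    # One and the same editing operation applies uniformly, whether its
    # argument is a math concept or part of AM's control structure.
    concepts[concept_name].setdefault("notes", []).append(note)

attach_note("Primes", "worth generalizing")       # modify a math concept
attach_note("Select_Task", "worth generalizing")  # modify AM itself

.end;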

.sec(How This Will be Accomplished)


Formulating a rich enough collection of meta-rules is a difficult
task, but we believe we have a handle on how to accomplish it.
For the original AM system, heuristics were extracted by having
an expert examine a particular mathematical discovery, then
brainstorm on how it might plausibly have been made. The
resultant scenario was then usually nothing more than a disguised
heuristic rule. The same kind of procedure appears to be
applicable in this case, to cull meta-level knowledge.
The expert examines a particular heuristic,
then tries to rationalize it, to create a scenario in which it
is discovered. That scenario is then reworded into a
meta-heuristic. 

Here is an example of this extraction process:

.begin preface 0; indent 0,3,0; spacing 0; once preface 1;

a)  The mathematician is asked to consider the concept of
prime numbers. How might one be led to define such a notion?
He arrives at a scenario, in which someone has just thought
about the concept "divisors-of" a number. The primes are simply
those numbers which have an extreme  (i.e., very small) value
under this function. 

b) The scenario is translated into a general heuristic rule.
The mathematician's key idea was that inverse images of extrema are
worth studying. All that is left to do is to express that idea in rule form.

c) Suppose that heuristic rule ("%bif f is an interesting function from
A to B, and b is an extreme kind of B, then it's worth defining and
investigating the f-inverse image of b%a") has been used productively
many times. The mathematician is now asked how someone might plausibly
have discovered that technique. That is, he must rationalize the
discovery of that heuristic method.

d) Eventually,  he may articulate some  very general meta-rule, e.g.:
"%bComplete  a syllogism of the  form `%aB is  to A as b  is to ?%b'
%a[20]%b,
when  the connections between B  and b --  and between B and A -- are
known%a". In  this  case, the  connection between  B  and A  is  just
%bf-inverse%a, and  between B and b  is just %bsubset-of%a.  Applying
the "syllogism" meta-rule would therefore yield the above "inverse of
extrema" heuristic rule.

e) Moreover,  we may ask the mathematician if  he notices any tricks,
any  heuristics, which are  useful for dealing with  all the concepts
discovered as a result of using the "inverse of extrema" rule. He'll look
at    instances   of    its   use   (the discoveries    of   primes   and
maximally-divisible numbers  using  the function  "Divisors-of",  the
discoveries of discrete and indiscrete topologies using union and
intersection, the discoveries  of simple
groups using factorization into subgroups, etc.),
and he may perceive that the synthesized concept (the inverse image)
is always intimately tied to the function f and its inverse.

f) This observation can also be phrased as a meta-rule: "%bAfter
creating a new concept C as f-inverse of b, tack a heuristic onto C
which states that C is intimately connected to f and f-inverse;
conjectures involving C and f (or f-inverse) will be very natural.%a"
This meta-rule might be invoked when primes are defined; it would
create a new heuristic rule and attach it to Primes: "%bPrimes
are intimately connected to multiplication and division.%a"
When the unique factorization theorem is first conjectured, that
heuristic would interject that it was quite "natural".
Goldbach's conjecture would of course  %bnot%a receive this extra support.

.end;
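
As a hedged sketch of how step (d)'s "syllogism" meta-rule might be
mechanized (the relation names are our own, and genuine analogical completion
is of course subtler):

.begin nofill; select b; preface 0;

# Sketch of "B is to A as b is to ?": when the B-to-A connection is
# known, re-apply it to b to propose the missing fourth term.
def complete_syllogism(connections, B, A, b):
    link = connections.get((B, A))     # here: "f-inverse"
    if link is None:
        return None
    return (link, b)                   # proposal: form link(b)

connections = dict()
connections[("B", "A")] = "f-inverse"  # how B is connected to A
connections[("B", "b")] = "subset-of"  # how B is connected to b
print(complete_syllogism(connections, "B", "A", "b"))
# prints ('f-inverse', 'b'): i.e., define f-inverse(b) -- exactly the
# "inverse of extrema" heuristic of step (c).

.end;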

The above should indicate how heuristics were -- and can be -- generated
by human beings; how heuristics might be automatically generated by more
general meta-heuristics; how the meta-heuristics can be readily generated
by people; and finally a hint that there are no further levels to ascend:
if a program can generate new heuristic rules, it can generate any kind
of rules, even meta-level ones.
The feasibility of this endeavor hinges on our ability to extract a
sufficient set of meta-rules; the extraction process has, we believe, now been demystified.

.sec(Timetable)

 The
timetable for the project is of course dependent upon
intermediate results, but a rough plan exists even at present:


.begin preface 0; indent 3,6,0; spacing 0;

i) By July 1, 1977 (initiation of support), the AM program will
be re-implemented at CMU, at or above the level at which it currently
exists at Stanford (i.e., as described herein and in [3]).

ii) By January 1, 1978, a large collection of meta-rules will
have emerged, and experimentation will begin on adding new
rules, new concepts, letting the system explore various
fields with varying amounts of human interaction. 

iii) By July 1, 1978, this experimentation will have proceeded
sufficiently far for a thrust along a new dimension,
tentatively that of change of representation.

iv) By January 1, 1979, reports of
progress will have emerged through the usual scientific
channels (journals, conferences, and technical reports).

v) By July 1, 1979 (termination
of support under this grant), whatever
limitation(s) AM has uncovered will have been  explored.
The rules and meta-rules will have been analyzed and codified.
 If some
continuation is indicated, a new proposal will be made.


.end;
.sec(Significance);

The impact of this new research will  be fourfold:

.begin preface 0; indent 4,7,0;

1.  Immediately and concretely,  the performance level of  AM will be
raised significantly. Mathematicians are willing to work with the new
system to try to produce new discoveries.

2. The methods used (representation of concepts, agenda with symbolic
reasons for each task,  heuristics encoded into production rules,
role of meta-heuristics in guiding search) are
themselves advancing the state of the art in AI. Work on AM provides
an ideal environment in which to test and improve these ideas.
Eventually,  some   variants  of  them  may  be   used  in  other  AI
applications  programs, wherever  a rich  store of  informal rules of
thumb exists to guide a complex inductive process.

3. Results  from working on the new system  should shed much light on
the  process   of   scientific  inference.   How  general   are   the
meta-heuristics? What kinds of heuristics can't be readily discovered
by very  general meta-level rules? Are  new heuristics created mainly
by hindsight, by analogy, by specializing more general ones,...?

4. The  limitations of  the augmented  AM system  may point  to  some
capability unexpectedly necessary for creative theory formation. Just
as AM indicated that  meta-level knowledge would be needed, to create
new rules  and change the set of facets  each concept may possess, so
this  new  research  may   indicate,  e.g.,  the  need  for  physical
intuition, for sociological and political concepts, for open
problems of great contemporary import upon which to focus one's
efforts, etc.

.end;

In Appendix V, some long-range applications of this research
are discussed.
 They are quite speculative, and therefore
have not been included within the body of this proposal.


.sec(Appropriateness of Available Resources);


CMU's Computer Science Department has a long history of outstanding
work in Artificial Intelligence. It is associated with names like
Newell, Reddy, Simon, Erman, Waldinger, Manna; with research projects
like the Hearsay system and PSG. It has three large machines
(PDP-10's) for CS research use exclusively, excellent languages
(various LISP's, SAIL,  BLISS, L*, etc.), competent faculty and graduate
students to critique intermediate aspects of the work, and an
unusually closely allied mathematics department, with
professors (Andrews, Fix) willing to interact with the project
(including in the necessary role of devil's advocate).
Moreover, as the joint nature of this proposal should indicate,
there is a close bond between CMU's CS and Psychology departments.
The latter is heavily oriented towards information processing
psychology, and specifically towards understanding, cognition, and
problem-solving (Klahr, Hayes, Newell, Simon). Finally, CMU is linked via the
ARPANET to the remote SUMEX site, where additional computer power
and scientific consultations have been offered to us for this
project.

Lenat and Simon are  logical choices
to carry on this project. The former 
is the creator of the original AM system, and holds a master's
degree in mathematics. The latter is the
author of many related articles and texts bearing on the issues
of how scientific discoveries are made.

In short, CMU has adequate facilities,
and the authors have adequate qualifications,
to ensure a high probability of success in this research.

.sec(Budget)

On the next page is the proposed budget for this project.
A couple of notes are in order: while Professor Simon will be
expending a considerable amount of time on this research,
no percentage of his salary need be paid by it.
The item mentioned under "Terminal" is a Mini-Bee (model 4).
All other budget items are relatively standard.


.page←page+1;

.spacing 0; preface 1; indent 0,0,0;
.begin "compose";
.send contents⊂
.chgfont(H2FONT,B);
.preface 0; spacing 0;

.⊃

.A!PPENDIX(REPRESENTATION OF CONCEPTS);

This appendix presents one of the original concepts provided
to AM -- Composition of relations -- in both its original
LISP form and a nice English translation. The facets
corresponding to heuristics associated with Compose
are not translated into English here. Rather, they are
provided as Appendix II of this proposal. 

.page←page+8;
.end "compose";
.begin "heuri";

.A!PPENDIX(A FEW OF AM's HEURISTICS);

Below are the 24 heuristic rules associated with the concept
"Compose". Most concepts had some heuristics associated with
them; the more general the concept, the more rules it had.
For a complete picture of the Compose concept, turn to
Appendix I of this proposal. The interested reader can
find therein the LISP encodings of these heuristics.
Since this material has been excerpted from [3], the
reviewer is asked to excuse the anomalous page headings
and numberings.

.page←page+3;
.end "heuri";
.begin "typscr";

.A!PPENDIX(|AN ACTUAL PRINTOUT OF uu-AM] RUNNING|);

Here is the way the AM program begins a typical run.
The human user's typing appears in italics. He first
must type %b(START)%a to the system, after which it
asks him about various parameter settings. Finally,
the main loop (Select a task, Gather relevant heuristics,
Execute them) is begun.
Bear in mind that these nine pages represent just three
per cent of a single run's typeout (about 9 tasks are
shown, out of an ultimate 300).

The typeout has not been changed in any way
(except that garbage-collection messages have been suppressed),
hence it may be
 difficult to follow.
A very condensed trace of an entire run appears
in the next appendix of this proposal.

A task here is called a "CAND"; the agenda is referred to
as the list "CANDS". Equality of objects is called "OBJ-EQUAL".
The user has AM type out the top three tasks on the
agenda just before it picks the top one each "cycle".

.page←page+9;


.end "typscr";
.begin "aabtrac";

.A!PPENDIX(|A TASK-BY-TASK CONDENSED TRACE|);

Below is a condensation of one run of the AM program.
Each task has been numbered and summarized. Following
this linear trace is a two-dimensional graph of AM's
behavior in concept-space. Each concept points to a
conceptual offspring, and is labelled by the numbers
of the tasks dealing with it. Thus, e.g., if the task numbers "jump"
from one node to a connected one, then so did AM's attention.
Finally, after the linear and two-dimensional traces, comes
a list of all the concepts discovered by AM during this run.

.page←page+9;


.end "aabtrac";
.begin "impli";

.A!PPENDIX(|LONG-RANGE GOALS|);



.page←page+1;


.end "impli";
.begin "vitee";

.A!PPENDIX(VITAE);
This section contains the vitae of the two principal
co-investigators of this project. Included are
lists of their relevant publications from the last
five years.
The %bnext%a appendix (Appendix VII) contains the references
cited in the proposal.

.page←page+12;


.end "vitee";
.begin "ref";

.A!PPENDIX(CITED REFERENCES);

.fill; indent 0,4,0; tabs 5; turn on "\"; spacing 0; preface 1;

[1]\Bruce  G. Buchanan, %bApplications of Artificial Intelligence to Scientific
Reasoning%a, Second USA-Japan Computer Conference, published by
AFIPS and IPSJ, Tokyo, 1975, pp. 189-194.

[2]\Randall Davis, %bApplications of Meta Level Knowledge to the
Construction, Maintenance, and Use of Large Knowledge Bases%a,
SAIL AIM-271, Artificial Intelligence Laboratory, Stanford
University, July, 1976.


[3]\Douglas B. Lenat, %bAM: An Artificial Intelligence Approach to
Discovery in Mathematics
as Heuristic Search%a, SAIL AIM-286,  Artificial
Intelligence Laboratory, Stanford University, July,
 1976.  Jointly issued as Computer Science Dept. Report No. STAN-CS-76-570.

[4]\R. D. Laing, "%bRules and Metarules%a",
in uu-%bThe Politics of the Family and Other Essays%a], Vintage
Books, N.Y., 1971, pp. 103-116.

[5]\Herbert Simon and Kenneth Kotovsky, %bHuman Acquisition of
Concepts for Sequential Patterns%a, Psychological Review 70, No. 6,
November, 1963, pp. 534-546.


[6]\Malcolm Pivar and Mark Finkelstein, %bAutomation, using LISP,
of Inductive Inference on Sequences%a, in (Berkeley and Bobrow, eds.)
uu-The Programming Language LISP: Its Operation and Applications],
Information International, Cambridge, Mass., March, 1964.


[7]\C. G. Hempel, %bFundamentals of Concept Formation in Empirical
Science%a, University of Chicago Press, Chicago, 1952.

[8]\Patrick H. Winston, %bLearning Structural Descriptions
from Examples%a, EE TR-76, Project MAC TR-231, MIT AI Lab,
September, 1970.

[9]\Herbert Simon, %bDoes Scientific Discovery Have a Logic?%a, Philosophy of Science,
40, No. 4, December, 1973, pp. 471-480.

[10]\Marvin Minsky, %bFrames%a, in (P. Winston, ed.)
uu-The Psychology of Computer Vision], McGraw Hill, N.Y., 1975.



[11]\Edward A. Feigenbaum, Bruce Buchanan, Joshua Lederberg, %bOn Generality
and Problem Solving: A Case Study Using the DENDRAL Program%a, in (Meltzer
and Michie, eds.) Machine Intelligence 6, 1971, pp. 165-190.

[12]\Inductive determination of the structure of proteins by automatic
processing of electron density X-ray crystallographic data (informal
communications with Dr. Robert Engelmore and Prof. Edward Feigenbaum of
Stanford University).


[13]\Henri Poincaré, %bThe Foundations of Science: Science and
Hypothesis, The Value of Science, Science and Method%a, The Science Press,
N.Y., 1929.

[14]\G. Polya, %bMathematics and Plausible Reasoning%a, John Wiley and
Sons, N.Y., (2 vols.) 1954.

[15]\Imre Lakatos, %bProofs and Refutations: The Logic
of Mathematical Discovery%a, Cambridge U. Press, London, 1976.

[16]\Woody W. Bledsoe, %bSplitting and Reduction Heuristics in
Automatic Theorem Proving%a, Artificial Intelligence
2, 1971, p. 73.

[17]\H. Wang, %bToward Mechanical Mathematics%a, IBM J.
Research and Development 4, No. 1, January, 1960, pp. 2-22.

[18]\H. Gelernter, %bRealization of a Geometry-Theorem Proving
Machine%a, in (Feigenbaum and Feldman, eds.) uu-Computers
and Thought], McGraw Hill, N.Y., 1963, pp. 134-152.

[19]\Robert Kling, %bReasoning by Analogy with Applications to Heuristic Problem
Solving: A Case Study%a, Memo AIM-147, CS Dept. Report CS-216,
Stanford U., August, 1971.

[20]\T. Evans, %bA Program for the Solution of Geometric-Analogy
Intelligence Test Questions%a, in (Minsky, ed.) uu-Semantic Information
Processing]. MIT Press, Cambridge, Mass., 1968, pp. 271-353.

[21]\Donald Knuth, %bSurreal Numbers%a, Addison-Wesley, Reading, Mass., 1974.


[22]\D. Brotz, %bEmbedding Heuristic  Problem Solving Methods
in a Mechanical Theorem Prover%a, Stanford University
Computer Science Dept. Report No. STAN-CS-74-443, August, 1974.

[23]\Arthur Koestler, %bThe Act of Creation%a, Dell Pub., N.Y.,  1967.

[24]\Seymour Papert, %bTeaching Children to be Mathematicians
Versus Teaching About Mathematics%a, Intl. J. Math Ed. Sci. Technology 3,
No. 3, July-Sept., 1972, pp. 249-262.

[25]\Jacques Hadamard,
%bThe Psychology of Invention in the Mathematical Field%a,
Dover Pub., N.Y., 1945.

.end "ref";

.every heading(,,{page});
.back;